Documentation for NodeLease feature being promoted to GA #17189

Merged 1 commit on Nov 18, 2019

wojtek-t
Member

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 25, 2019
@k8sio-netlify-preview-bot
Collaborator

k8sio-netlify-preview-bot commented Oct 25, 2019

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit 718d644

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/5dd284e1b4847e0008bd7af2

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Oct 25, 2019
@wojtek-t
Member Author

/milestone v1.17

@k8s-ci-robot
Contributor

@wojtek-t: You must be a member of the kubernetes/website-milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your Website milestone maintainers and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone v1.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -202,6 +200,9 @@ The following table contains feature gates for graduated or deprecated features.
| `MountPropagation` | `false` | Alpha | 1.8 | 1.9 |
| `MountPropagation` | `true` | Beta | 1.10 | 1.11 |
| `MountPropagation` | `true` | GA | 1.12 | - |
| `NodeLease` | `false` | Alpha | 1.12 | 1.13 |
| `NodeLease` | `true` | Beta | 1.14 | 1.16 |
| `NodeLease` | `true` | Beta | 1.17 | - |
| `PersistentLocalVolumes` | `false` | Alpha | 1.7 | 1.9 |
Contributor

hello @wojtek-t . Some initial suggestions:

  • Line 205: Is the state GA?
  • Line 177: the kubelet or the Kubelet?
  • Line 182: This sentence could be reworded and possibly add more explanation, spelling too. Does this make sense?
    Compared to the Node resource, the Lease object is lightweight. The Lease resource improves the performance of the node heartbeats as the cluster scales?

From the KEP:
We will use that object to represent node heartbeat - for each Node there will be a corresponding Lease object with Name equal to Node name in a newly created dedicated namespace
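
The naming rule in the KEP quote can be sketched in a few lines of Python. This is only an illustration, not code from the PR or the kubelet: the helper `lease_ref_for_node` and the node name `worker-1` are hypothetical, the namespace default is the `kube-node-lease` value discussed in this review, and `coordination.k8s.io/v1` is assumed as the Lease API version.

```python
def lease_ref_for_node(node_name: str, namespace: str = "kube-node-lease") -> dict:
    """Build a minimal reference to the Lease that mirrors a Node.

    Hypothetical helper: the Lease name equals the Node name, and all
    node Leases live in one dedicated namespace.
    """
    return {
        "apiVersion": "coordination.k8s.io/v1",  # assumed Lease API version
        "kind": "Lease",
        "metadata": {"name": node_name, "namespace": namespace},
    }


if __name__ == "__main__":
    # "worker-1" is a made-up node name used only for illustration.
    print(lease_ref_for_node("worker-1"))
```

Because the mapping is purely by name, finding the heartbeat object for a Node needs no lookup table, only the Node name and the namespace.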

Member Author

rephrased

Contributor

@sftim left a comment

Informal feedback

node lease is much more lightweight than NodeStatus, this feature makes node
heartbeat significantly cheaper from both scalability and performance
perspectives.
Each node has an associated `Lease` object in `kube-node-lease` namespace.
Contributor

nit

Suggested change
Each node has an associated `Lease` object in `kube-node-lease` namespace.
Each Node has an associated `Lease` object in the `kube-node-lease` namespace.

Contributor

Alternatively (see other comment) omit this line.

Member Author

done

Node leases are renewed frequently while NodeStatus is reported from node to
master only where there is some change or enough time has passed (default is
5 minutes, which is longer than the default timeout of 40 seconds for
unreachable nodes). Since Lease is much more lighweigh object than Node,
Contributor

  1. Typo: “lighweigh”
  2. This page is about Nodes, and the reader could be totally new to Kubernetes. This page is a lot of detail to explain what a Node is. If anything, I'd prefer to slim it further.

Contributor

How about a heading: “Node leases” plus 100 words on the control plane issuing Node leases to keep track of Node health / failures.

Does NodeStatus even need mentioning here?

Member Author

I think it does - the reason is that for the last 5 years, NodeStatus was the only signal for heartbeat. Now this is only treated as an additional one.

Member Author

That said, I tried to rephrase it a bit (applied suggestions by @kbhawkey) and organize it slightly better.

@makoscafee
Contributor

/milestone 1.17

@k8s-ci-robot k8s-ci-robot added this to the 1.17 milestone Nov 4, 2019
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 8, 2019
Member Author

@wojtek-t left a comment

Comments applied - PTAL

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 8, 2019
Contributor

@sftim left a comment

Some feedback - hope it's useful.

#### Heartbeats

Each Node has an associated `Lease` object in `kube-node-lease` namespace.
It is periodically renewed by the kubelet and both NodeStatus and the Lease
Contributor

Suggested change
It is periodically renewed by the kubelet and both NodeStatus and the Lease
The kubelet updates its Lease object frequently to show that its Node is healthy. In the control plane, both NodeStatus and the Lease

Member Author

done

Each Node has an associated `Lease` object in `kube-node-lease` namespace.
It is periodically renewed by the kubelet and both NodeStatus and the Lease
are treated as heartbeats from the node.
Node leases are renewed frequently while NodeStatus is reported from node to
Contributor

If accepting the previous suggestion, how about this wording?

The kubelet updates NodeStatus either when there is change in status, or
if there has been no update for a configured interval. The default interval
for NodeStatus updates is 5 minutes (much longer than the 40 second
default timeout for unreachable nodes).
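
To make the timing argument concrete, here is a minimal Python sketch of the rule described in the wording above. It is not the kubelet's implementation: the function names are hypothetical, and the constants are simply the defaults quoted in this thread (5 minutes for NodeStatus updates, 40 seconds for the unreachable timeout).

```python
NODE_STATUS_INTERVAL_S = 5 * 60  # default NodeStatus update interval quoted above
UNREACHABLE_TIMEOUT_S = 40       # default timeout for unreachable nodes quoted above


def should_report_node_status(status_changed: bool,
                              seconds_since_last_report: float,
                              interval_s: float = NODE_STATUS_INTERVAL_S) -> bool:
    """Report on a status change, or once the configured interval has elapsed."""
    return status_changed or seconds_since_last_report >= interval_s


def looks_unreachable(seconds_since_last_heartbeat: float,
                      timeout_s: float = UNREACHABLE_TIMEOUT_S) -> bool:
    """A node counts as unreachable once no heartbeat has arrived within the timeout."""
    return seconds_since_last_heartbeat > timeout_s


if __name__ == "__main__":
    # An idle node that relied on NodeStatus alone would stay silent for up to
    # 300 seconds, far longer than the 40-second timeout.
    print(should_report_node_status(False, 120))      # no change, interval not reached
    print(looks_unreachable(NODE_STATUS_INTERVAL_S))  # silence of a full interval
```

Under these assumptions, a healthy but unchanged node would be flagged as unreachable long before its next NodeStatus report, which is the gap the more frequent Lease renewals are meant to cover.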

Member Author

done

Node leases are renewed frequently while NodeStatus is reported from node to
master only where there is some change or enough time has passed (default is
5 minutes, which is longer than the default timeout of 40 seconds for
unreachable nodes). Compared to the Node resource, the Lease object is
Contributor

Compared to the Node resource, the Lease object is lightweight. The Lease resource improves the performance of the node heartbeats as the cluster scales.

As NodeLease is GA (once this is merged, at least), and the oldest supported Kubernetes version will have NodeLease enabled by default, it's less important to explain the older method. I think it's OK to assume that readers are using NodeLease and that it pretty much “just works” for them.
I would cut out these 2 sentences.

Member Author

I'm not sure - this shows why this is even needed.

perspectives.
#### Heartbeats

Each Node has an associated `Lease` object in `kube-node-lease` namespace.
Contributor

Suggested change
Each Node has an associated `Lease` object in `kube-node-lease` namespace.
Each Node has an associated Lease object in the `kube-node-lease` {{< glossary_tooltip term_id="namespace" text="namespace">}}.

Member Author

done

@wojtek-t
Member Author

wojtek-t commented Nov 8, 2019

PTAL

@wojtek-t wojtek-t changed the title [WIP] [Placeholder] Promote NodeLease feature to GA Documentation for NodeLease feature being promoted to GA Nov 12, 2019
@wojtek-t
Member Author

@kbhawkey - PTAL

@wojtek-t
Member Author

PTAL

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 13, 2019
@daminisatya
Contributor

@sftim review from your side? Is it a lgtm?

Contributor

@sftim left a comment

I'm happy with these changes.

@wojtek-t
Member Author

Thanks!
Can someone approve these changes then?

@sftim
Contributor

sftim commented Nov 14, 2019

@daminisatya ?

Contributor

@sftim left a comment

I notice a bunch of tiny, nit changes (mainly extra backticks). Also fine to merge as is, IMO.

Compared to the Node resource, the Lease is a lightweight resource, which improves
the performance of the node heartbeats as the cluster scales.

The `kubelet` is responsible for creating and updating `NodeStatus` and `Lease`.
Contributor

Late, nit tweaks:

Suggested change
The `kubelet` is responsible for creating and updating `NodeStatus` and `Lease`.
The kubelet is responsible for creating and updating `NodeStatus` and a Lease.

Member Author

done


The `kubelet` is responsible for creating and updating `NodeStatus` and `Lease`.

- The `kubelet` updates the `NodeStatus` either when there is change in status,
Contributor

Suggested change
- The `kubelet` updates the `NodeStatus` either when there is change in status,
- The kubelet updates the `NodeStatus` either when there is change in status,

Member Author

done

or if there has been no update for a configured interval. The default interval
for `NodeStatus` updates is 5 minutes (much longer than the 40 second default
timeout for unreachable nodes).
- The `kubelet` creates and then updates the `Lease` object every 10 seconds
Contributor

Suggested change
- The `kubelet` creates and then updates the `Lease` object every 10 seconds
- The kubelet creates and then updates its Lease object every 10 seconds

Member Author

done

for `NodeStatus` updates is 5 minutes (much longer than the 40 second default
timeout for unreachable nodes).
- The `kubelet` creates and then updates the `Lease` object every 10 seconds
(the default update interval). `Lease` updates occur independently from the
Contributor

@sftim Nov 14, 2019

Suggested change
(the default update interval). `Lease` updates occur independently from the
(the default update interval). Lease updates occur independently from the

Member Author

done

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2019
Member Author

@wojtek-t left a comment

@sftim - done; PTAL


Heartbeats, sent by Kubernetes nodes, help determine the availability of a node.
There are two forms of heartbeats: updates of `NodeStatus` and the
[Lease object](/docs/reference/generated/kubernetes-api/{{< latest-version >}}/#lease-v1-coordination-k8s-io).
Each Node has an associated `Lease` object in the `kube-node-lease`
Contributor

@kbhawkey Nov 14, 2019

These changes look good. If you are cleaning up nits ...

Contributor

kubectl is 100% a command line tool. kubelet, whilst also an executable, isn't a tool that people usually run in a command line, at least not during typical cluster operation. I see kubelet fitting into the same slot as kube-scheduler, kube-controller-manager, and kube-proxy.

(I've used backticks for those names in this very comment but in the docs I typically wouldn't).

Member Author

Fixed the Lease.

I agree with @sftim that kubelet is in the same category as kube-scheduler, etc.
[in this file, kubelet is not in backticks anywhere]

Contributor

Hello @wojtek-t . It feels like these changes should be ready 😀 . Okay to override these comments, but if you have the energy for one more change, I'd suggest splitting/adjusting lines 183 - 184.

Lease is a lightweight resource, which improves the performance of the node heartbeats as the cluster scales.
OR
Lease is a smaller resource than the Node, which improves the performance of the node heartbeats as the cluster scales.

Line 186: The kubelet is responsible for creating and updating the NodeStatus and a Lease object.

Member Author

Done.

@wojtek-t
Member Author

@sftim @kbhawkey - PTAL

@wojtek-t
Member Author

@sftim @kbhawkey - I applied the next comment. PTAL
[That said, can we have it merged? I have a feeling that we can do those iterations indefinitely...]

@kbhawkey
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 18, 2019
@kbhawkey
Contributor

😁
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kbhawkey

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 18, 2019
@k8s-ci-robot k8s-ci-robot merged commit 23d2123 into kubernetes:dev-1.17 Nov 18, 2019
@wojtek-t
Member Author

@kbhawkey - thanks a lot!

mrbobbytables pushed a commit that referenced this pull request Dec 6, 2019
k8s-ci-robot pushed a commit that referenced this pull request Dec 10, 2019
* feat: graduate TaintNodesByCondition to GA (#17073)

* Promote StartupProbe to beta (enabled by default). (#17164)

* Watch bookmarks to GA (#17026)

* feat: graduate ScheduleDaemonSetPods to GA (#17350)

* Update Docker installation instructions (#17405)

* Use exact version numbers for installing Docker in Ubuntu (#17428)

* Move CSIMigration and CSIMigrationGCE to Beta in Kubernetes v1.17 (#17478)

* Promote NodeLease feature to GA (#17189)

* Update docs for csi topology ga (#17408)

* Update RunAsUsername to beta (#17460)

* doc:Update RunAsUsername to beta

* doc: update samples - kubernetes.io/os is no longer beta

* Updating based on review feedback

* Promote Node-specific volume limits to GA (#17432)

* Promote PodShareProcessNamespace to stable (#17192)

* Promote PodShareProcessNamespace to stable

* Add for_k8s_version to feature-state label

Co-Authored-By: Tim Bannister <[email protected]>

* Readd version-check to shareProcessNamespace task

* Update service load balancer finalizer doc for GA (#17438)

* Update Topology Manager docs (#17451)

* Added information on how device plugins can take advantage
of Topology Manager
* Updated the Topology Manager documentation to include additional information and update some out of date sections

* Fix broken Topology Manager link (#17746)

Part of What's Next Device Plugin section

* Update CRD defaulting docs for GA (#17450)

* Add documentation for VolumeSnapshot Beta (#17233)

* Updating EndpointSlice documentation for beta release in 1.17 (#17411)

* (docs/dualstack): v1.17 updates (#17457)

* Add placehold doc updates for dualstack in 1.17

Signed-off-by: Lachlan Evenson <[email protected]>

* Add Downward API and /etc/hosts Pod IP validation

Signed-off-by: Lachlan Evenson <[email protected]>

* remove addressed known issue via k/k pr 85246

Signed-off-by: Lachlan Evenson <[email protected]>

* Remove known issue and add flag as part of k/k 79993

Signed-off-by: Lachlan Evenson <[email protected]>

* remove follow up placeholders

Signed-off-by: Lachlan Evenson <[email protected]>

* Update verbiage

Signed-off-by: Lachlan Evenson <[email protected]>

* Make IP addressing consistent throughout the task

Signed-off-by: Lachlan Evenson <[email protected]>

* Update to status.podIPs

Signed-off-by: Lachlan Evenson <[email protected]>

* Update content/en/docs/tasks/network/validate-dual-stack.md

Use set instead of env

Co-Authored-By: Khaled Henidak (Kal) <[email protected]>

* add topology.kubernetes.io/zone, topology.kubernetes.io/region and node.kubernetes.io/instance-type labels to docs (#17498)

Signed-off-by: Andrew Sy Kim <[email protected]>

* Service topology alpha documentation (#17459)

* Update list of feature flags for in-tree plugins migrated to CSI (#17533)

Signed-off-by: Deep Debroy <[email protected]>

* Update Node concept for TaintNodesByCondition going GA (#17577)

* feat: graduate ResourceQuotaScopeSelectors to GA in 1.17 (#17554)

* kubeadm: update the upgrade documentation for 1.17 (#17587)

* doc: Simplify Windows deployments with RuntimeClass (#16697)

* doc: Simplify Windows deployments with RuntimeClass

* Updating on review feedback

* doc: Adding windows-build label from enhancement 1301

* update doc for kubelet option --reserved-cpus (#17648)

* feat: update TaintNodesByCondition in feature gates table (#17377)

* Update docs for v1 resource quota configuration (#17547)

* AdmissionConfiguration v1 (#17548)

* Update WebhookAdmissionConfiguration examples (#17549)

* Update AWS EBS Migration Feature state (#16126)

* Add resource version section to api-concepts documentation (#16910)

* Add Resource Version semantics section to api concepts

* Clarify risks of going back in time, add details about compaction and watch cache sizes

* Apply suggestions from liggitt

Co-Authored-By: Jordan Liggitt <[email protected]>

* remove pseudocode, apply feedback

* Fix typo

* Clarify equality rules

* Cleanup kubectl generators docs (#17609)

* Write ReplicationController without a space

* Drop mentioning unsupported cluster versions

* Fix capitalization for “API group”

* Tweak wording

* Avoid using deprecated generator in example

* add Antrea description in dev-1.17 (#17919)

* Promote VolumeSubpathEnvExpansion to GA

* Reference Documentation for the Kubernetes API for 1.17 (#18019)

* Update feature-gates.md (#18033)

* Reference Documentation for kubectl Commands for 1.17 (#18017)

* Update for v1.17 (#18034)

* Update config.toml(release-1.17) for 1.17 (#18031)
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
7 participants